Chapter 21
Indexing Your Intranet with WAIS
By now, you've made a good deal of data available on your Intranet, or
at least you have some ideas about what you want to put there.
In all likelihood, your Intranet will eventually accumulate
a substantial volume of data. The obvious next question is how
your customers will find anything among all
that data. The equally obvious answer is for you to provide searchable
indexes on your Intranet. In this chapter, you'll learn how to
enable your customers to both search your indexes and retrieve
documents (or other data files) using their Web browsers.
This chapter talks about how WAIS works, how to install it, how
to use it, and how to search with it. At the end of the chapter,
I'll go over a few alternative indexing/searching technologies,
including Excite for Web Servers. In case you haven't heard of
Excite already, it is now available for Windows NT, it is coming
on strong, and it's free!
WAIS, which stands for Wide Area Information Server, is a system
for indexing large amounts of data and making them searchable
over a TCP/IP network. It's misnamed, though, because it works
just as well on a local network as it does over the Internet.
WAIS server software indexes data and responds to requests from
WAIS clients to search the indexes and return a list of documents
that match the search. Based on an ANSI standard for indexing
library materials in computer systems (Z39.50), WAIS can form
an important part of your Intranet. Because WAIS uses the Z39.50
protocol, you might hear the two terms used synonymously.
WAIS supports not only simple keyword searches, but also Boolean
queries (for example, thiskeyword and thatkeyword) and even
plain English searches. In addition, WAIS can do relevance searching: you
can select part or all of a document that your WAIS search has
found and ask for a new search based on the selection. In other
words, WAIS will find more documents like the one it found.
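To make the idea of a Boolean query concrete, here is a minimal Python sketch of an AND search over a tiny in-memory index. It is not how WAIS is implemented; the document names, the sample text, and the helper names build_index and search_and are made up for illustration.

```python
from collections import defaultdict

def build_index(docs):
    """Map each lowercase word to the set of document names containing it."""
    index = defaultdict(set)
    for name, text in docs.items():
        for word in text.lower().split():
            # Strip surrounding punctuation so "address." matches "address".
            index[word.strip(".,;:!?\"'()")].add(name)
    return index

def search_and(index, *keywords):
    """Return the documents that contain every keyword (Boolean AND)."""
    sets = [index.get(k.lower(), set()) for k in keywords]
    return set.intersection(*sets) if sets else set()

# Hypothetical sample documents.
docs = {
    "staff.txt": "Mailing address and phone list for staff.",
    "policy.txt": "Travel policy. Send forms to the mailing address below.",
    "menu.txt": "Cafeteria menu for the week.",
}
index = build_index(docs)
print(sorted(search_and(index, "mailing", "address")))  # ['policy.txt', 'staff.txt']
```

A real index such as the one WAIS builds is stored on disk and also records how often each word occurs, which is what makes ranked results possible.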
WAIS was originally developed as free software at a company called
Thinking Machines, Incorporated. Then the software was commercialized
by WAIS, Inc., which is now owned by America Online (the Internet
buying frenzy continues). At the time of this writing, AOL has
not yet announced plans for WAIS, Inc. and its technology. Fortunately,
WAIS software for Windows NT is available for free through EMWAC,
and the good folks at Sams.net have arranged to include it on
the CD-ROM with this book.
Note: In addition to the WAIS Toolkit, EMWAC has also developed a freeware Web server (HTTPS), Gopher server (GS), and SMTP server (IMS) for Windows NT. You can find more information about their highly regarded server software at this URL:
http://emwac.ed.ac.uk/
Although NCSA Mosaic has built-in WAIS client support, Netscape
and Explorer don't. As a result, you must run a WAIS gateway on
your Intranet to support users of those browsers. Fortunately,
the EMWAC WAIS Toolkit on the CD-ROM serves this purpose nicely.
You can use an HTML page as a front end to the WAIS search engine.
The EMWAC WAIS Toolkit returns the results of the search in HTML
format with matched documents as clickable hyperlinks. Once you
learn how to set this up, it works beautifully. Web searching
is definitely a slick feature to add to your Intranet. (And you
will soon see it is not hard to set up at all.)
Figure 21.1 shows a demonstration WAIS search result for the keyword
address. The results page is nicely formatted in HTML with
hyperlinks to each of the located documents. WAIS has applied
a best-guess score (maximum 1000) to each document for its potential
value to the user searching for the keyword. Documents containing
fewer occurrences of the word address are given lower scores.
The highest scoring documents appear at the top of the list. The
document with the search keyword contained in the HTML <TITLE>
tag is given a perfect score. WAIS also displays the file size
in bytes of each document, as that may help the user determine
which hyperlink jump to take.
Figure 21.1: The results of the WAIS search for the word address.
The relative weighting of the found documents in the WAIS search
results is based on a number of useful criteria, such as word
frequency within the individual documents and the index as a whole.
With multiword and Boolean searches, the weighting takes all the
search words into account, so a document containing all your search
words would get more weight than one that contained multiple instances
of just one of them, for example.
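The following Python sketch illustrates frequency-based weighting of the kind described above: a word counts for more when it is frequent in a document and rare across the collection, and the best match is scaled to 1000. The formula is an assumption chosen for illustration, not the scoring algorithm WAIS actually uses, and the sample documents are invented.

```python
import math

def score_documents(docs, query_words):
    """Score each matching document, scaling the best match to 1000."""
    tokenized = {name: text.lower().split() for name, text in docs.items()}
    total_docs = len(tokenized)
    raw = {}
    for name, words in tokenized.items():
        score = 0.0
        for q in query_words:
            count = words.count(q)
            if count == 0:
                continue
            containing = sum(1 for w in tokenized.values() if q in w)
            # Words that are rare across the whole collection weigh more.
            rarity = math.log(1 + total_docs / containing)
            # Use frequency relative to length so long files do not win on size alone.
            score += (count / len(words)) * rarity
        if score:
            raw[name] = score
    best = max(raw.values(), default=0.0)
    return {name: round(1000 * s / best) for name, s in raw.items()}

# Hypothetical sample documents.
docs = {
    "a.txt": "address book address list",
    "b.txt": "the office address is on the intranet home page",
    "c.txt": "lunch menu",
}
scores = score_documents(docs, ["address"])
print(scores)
```

Documents that do not contain any query word are left out of the result entirely, which mirrors the way a search result page lists only matching files.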
If your Intranet is like most others, much of the data you'll
want to index is in (or can be put into) plain text files of one
kind or another. WAIS understands a wide variety of text formats.
WAIS also knows about several kinds of image formats and can be
coaxed into indexing them (or at least their filenames).
In addition, the package has special features that make it easy
to integrate your data indexing into your Intranet, with a focus
on Web-related capabilities. One major source of data that you
may want to index is the data on your Web server itself. Finally,
as if these capabilities weren't enough, you can teach WAIS to
recognize and index new data formats.
Building a WAIS index is rather simple. The following is a quick
overview of the steps involved in using WAIS (this process will
be covered in more detail later in this chapter):
- Build a WAIS index of all the HTML files at your site by using
the program named waisindex.exe.
WAIS creates several files that comprise your index. If you name
the index myindex, for example,
you will end up with files named myindex.*.
- When using IIS, enable automatic WAIS searching by setting
CheckForWAISDB = 1 in the
Registry.
- Create a search page written in HTML in the same directory
as the WAIS index and using the same base filename (for example,
myindex.htm).
- Include the <ISINDEX>
tag in the <HEAD> section
of the HTML search page and provide a link from your home page
to this HTML document.
- When a user loads the search page, the browser prompts for a search
keyword. After the user enters the word, the server automatically
invokes waislook.exe and
returns a list of matching documents. WAIS is that simple, and
your Intranet users will love you for the added functionality
it provides.
The EMWAC WAIS Toolkit included on the CD-ROM will help you create
a database of all the text at your Web site so that users can
search it by keyword. The creators of HTML designed the <ISINDEX>
tag with this feature in mind. The <ISINDEX>
tag causes the Web server to invoke a program named waislook
to search a WAIS database and return links to the pages containing
the search keyword. (The WAIS database is also referred to as
an index.)
Note: The European Microsoft Windows NT Academic Centre (EMWAC) has developed several excellent freeware programs for Windows NT. Programs that are written for Windows NT on the Intel platform will usually run on Windows 95 also. This is because both Windows NT and Windows 95 support the common Win32 API, which enables programs to call functions in the operating system in a consistent manner using 32-bit parameters for integers and resource handles.
Follow these steps to install the EMWAC WAIS Toolkit:
- The WAIS Toolkit is distributed in four versions for the different
architectures that Windows NT supports. Select the appropriate
WAIS ZIP file on the CD-ROM for your processor. For example, the
WAIS Toolkit for Intel is contained in the file wti386.zip.
- Decide which directory you are going to put the tools in so
you can unzip the .EXE programs directly from the CD-ROM to your
hard disk using the WinZip program. Ensure that the directory
you chose is on the path so that the commands may be executed
from the command line.
- Unzip the WAIS Toolkit. This action should leave you with
the following files:
- The waisindx.exe
program builds the index of documents on the server. Note in the
EMWAC distribution that the name of the executable file was shortened
from waisindex to waisindx
for compatibility with older versions of Windows that only allowed
eight-character filenames. You can rename this program from its
8.3 filename to waisindex.exe
in order to match the EMWAC documentation.
- The waislook.exe
file is the searching program invoked by the Web server when an
HTML page includes <ISINDEX>.
- The waisserv.exe
file is the Z39.50 searching server program. You won't need this
file unless you plan to run WAIS clients. Users running Navigator
and Explorer will depend on the server invoking the waislook
program on their behalf.
- The waistool.doc
file is the WAIS Toolkit manual in Word for Windows format.
- The waistool.wri
file is the WAIS Toolkit manual in Windows Write format.
- The waistool.ps
file is the WAIS Toolkit manual in PostScript and is ready for
printing.
- The read.me
file is a summary of features.
- If you have installed a previous version of the WAIS Toolkit,
remove it by deleting the old files or by moving them to another
directory (which is not referred to by the PATH
environment variable) for deletion after you have validated that
the new version works correctly.
- Determine which version of the WAIS Toolkit you have by typing
these commands at the command prompt:
waisserv -v
waisindx -v
waislook -v
The version number for each program will be displayed. Two version
numbers will be shown for waisindx and waisserv; the first refers
to the version of the freeWAIS code from which the programs were
ported, and the second is the number of the Win32 version. As
you can see in Figure 21.2, which shows the execution of those
commands on my system, I am running version 0.73 for Windows NT.
If the programs report a later version number on your system,
you will find an updated manual in the files you unpacked from
the ZIP archive (the information in this chapter would still be
expected to work with few or no changes).
Figure 21.2: The results of checking the WAIS version numbers.
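If you prefer to script the version check, a short Python sketch along these lines can run the same three commands. It assumes Python is installed on the machine and that the toolkit directory is on the PATH; the function names are made up for this example, and the tools' output is printed as is rather than parsed.

```python
import shutil
import subprocess

# The three toolkit programs named in the installation steps.
TOOLS = ["waisserv", "waisindx", "waislook"]

def version_commands(tools=TOOLS):
    """Build the '<tool> -v' command line for each program."""
    return [[tool, "-v"] for tool in tools]

def report_versions():
    """Run each version command and print whatever the tool reports."""
    for cmd in version_commands():
        if shutil.which(cmd[0]) is None:
            print(f"{cmd[0]}: not found on PATH")
            continue
        result = subprocess.run(cmd, capture_output=True, text=True)
        print(f"{cmd[0]}: {(result.stdout or result.stderr).strip()}")

if __name__ == "__main__":
    report_versions()
```

A "not found on PATH" message here is a quick hint that the directory you unzipped the toolkit into has not been added to the path yet.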
To create a WAIS database of the HTML files at your site, follow
these steps. (Assume for the purposes of this discussion that
d:\http is the home directory
of your Web site.)
- Make d:\http, or the
HTML root, the current directory.
- Execute waisindx (or
waisindex, if you have renamed
it to use long filenames), giving it parameters as shown in the
following code. The -d parameter
is used to name the index files which are created. The default
name if no parameter is given is index,
which I will assume is in use for the remainder of this chapter.
The -r parameter tells WAIS
to search all subdirectories. The -t
(lowercase) parameter indicates the type of files being indexed.
WAIS handles text files and HTML with ease. If you know all the
files are HTML, WAIS will use the <TITLE>
tags for the file headlines. The last parameter specifies the
files that you want to search, which are, in this case, all HTML
files in the HTML root directory.
waisindx -d index -r -t html *.htm*
- Observe the messages from waisindx
to check that there are no errors.
- Execute a dir index.*
command on the d:\http directory
to check that waisindx has
created the seven index files, named index.*
and described in the following text.
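The indexing steps above can also be driven from a script, which is handy if you rebuild the index on a schedule. This Python sketch assembles the same waisindx command line shown in the steps and runs it from the Web root. The helper names are hypothetical, and the sketch assumes waisindx is on the PATH.

```python
import subprocess

def waisindx_command(database="index", doc_type="html", pattern="*.htm*", recursive=True):
    """Assemble the waisindx arguments used in the steps above."""
    cmd = ["waisindx", "-d", database]
    if recursive:
        cmd.append("-r")
    # The wildcard is passed through unchanged, as it is from the command prompt.
    cmd += ["-t", doc_type, pattern]
    return cmd

def run_index(web_root):
    """Run waisindx with the Web root as the current directory."""
    # Raises FileNotFoundError if waisindx cannot be found on the PATH,
    # and CalledProcessError if the program exits with an error status.
    return subprocess.run(waisindx_command(), cwd=web_root, check=True)

print(" ".join(waisindx_command()))  # waisindx -d index -r -t html *.htm*
```

Calling run_index(r"d:\http") would then correspond to the first two steps of the procedure.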
The following text describes the files created by waisindx:
- The index.cat
file is the catalog of the indexed files, with about three lines
of information for each file indexed. This is a text file.
- The index.dct
file is the dictionary of indexed words. This is a binary file.
- The index.doc
file is the document table. This binary file may contain several
documents, depending on the type specified in the -t
option.
- The index.fn
file is the filename table. This is a binary file. The filenames
stored in this table are supplied as the final parameters to waisindx.
Thus, if filenames are supplied relative to the current directory
(for example, files/*), they
will be stored in the filename table in that form, and the resulting
filenames from a database search will also be in relative form.
- The index.hl
file is the headline table and a binary file. A headline
is (ideally) a line of descriptive text summarizing the contents
of a document. The headline is normally taken from the document
itself. For instance, it might be the Subject line if the document
is a mail message, the first line of the file, or the filename
itself. Which one is used depends on the type of the file, as specified
to waisindx using the -t
option. If you use -t HTML,
waisindx will use the HTML
<TITLE> tag to generate
this headline.
- The index.inv
file is the inverted file index. This is a binary file.
- The index.src
is the source description structure. This is a text file.
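As a scripted alternative to the dir index.* check, the Python sketch below reports which of the seven index files described above are missing from a directory. The function name is made up for this example; the file extensions come straight from the list.

```python
from pathlib import Path

# Extensions of the seven files that waisindx creates.
INDEX_EXTENSIONS = ("cat", "dct", "doc", "fn", "hl", "inv", "src")

def missing_index_files(directory, base="index"):
    """Return the names of expected index files that are not present."""
    folder = Path(directory)
    return [
        f"{base}.{ext}"
        for ext in INDEX_EXTENSIONS
        if not (folder / f"{base}.{ext}").is_file()
    ]

if __name__ == "__main__":
    missing = missing_index_files(".")
    print("All index files present" if not missing else f"Missing: {missing}")
```

An empty list means all seven files exist; it says nothing about whether their contents are up to date.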
Using <ISINDEX> with WAIS
Now that the WAIS index files are created, you need to modify
your HTML code to take advantage of them. This is where the HTML
<ISINDEX> tag enters
the picture. Remember, the HTTP server is designed to automatically
invoke waislook whenever
it receives an <ISINDEX>
request from the client.
This automatic invocation of waislook
should not be taken for granted. I've only seen this work with
three Web servers: Process Purveyor for NT, EMWAC HTTPS, and Microsoft
IIS. Other Web servers, such as Alibaba, require a different procedure
to take advantage of <ISINDEX>.
I won't get into the details of that procedure in this chapter,
but I can point you to Richard Graessler's home page, which contains
thorough information about the topic. Mr. Graessler has written
a very nice batch script that can be used on Windows NT to pass
<ISINDEX> search parameters
to waislook on Alibaba or
other Web servers. He kindly provides the source code free of
charge on his Web site at this URL:
http://rick.wzl.rwth-aachen.de/rickg/IsIndex/isindex.html
Because much of this book is based upon Microsoft IIS, I will
assume you are using that Web server. In that case, there is a
Registry setting that you must ensure is set properly. Follow
these steps:
- Start the Registry Editor and drill down to the following
key:
HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W3SVC\Parameters
- Look in the right-side window pane to see if you already have
a value named CheckForWAISDB.
If so, and if it has a value of 1,
then IIS is ready to invoke waislook.
- If you don't have a value named CheckForWAISDB,
choose Edit | Add Value. Type in the Value Name and choose REG_DWORD
for the Data Type.
- After you choose OK, the DWORD Editor dialog box will prompt
you for the initial value. Enter a value of 1,
and choose OK again.
- Check that the value is entered correctly, and then exit from
the Registry Editor. Now IIS will support <ISINDEX>
searches using the WAIS Toolkit.
Note: These steps are not necessary with the EMWAC HTTPS Web server because it is capable of automatically invoking waislook.exe.
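If you need to apply the same Registry setting on several servers, one option is to generate a small .reg file and import it with the Registry Editor instead of typing the value by hand. The Python sketch below writes such a file in the REGEDIT4 text format. This approach is an assumption on my part rather than a procedure from the toolkit documentation, so try it on a test machine first; the function and file names are invented.

```python
# Registry key named in the steps above.
KEY = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\W3SVC\Parameters"

def reg_file_text(value_name="CheckForWAISDB", value=1):
    """Return .reg file text that sets a REG_DWORD value under KEY."""
    return (
        "REGEDIT4\r\n\r\n"
        f"[{KEY}]\r\n"
        # DWORD values are written as eight hexadecimal digits.
        f'"{value_name}"=dword:{value:08x}\r\n'
    )

def write_reg_file(path="checkforwaisdb.reg"):
    """Write the .reg text to disk, keeping the CRLF line endings intact."""
    with open(path, "w", newline="") as handle:
        handle.write(reg_file_text())

print(reg_file_text())
```

Importing the file has the same effect as the manual steps: it creates CheckForWAISDB as a REG_DWORD with a value of 1.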
The next step is to create a new search page named index.htm
that contains the <ISINDEX>
tag in the <HEAD> section.
Figure 21.3 shows how the <ISINDEX>
tag is interpreted by Microsoft Explorer. The user is preparing
a search for the keyword address, the results of which
were shown earlier in the chapter. Listing 21.1 contains the HTML
code for the sample page shown in Figure 21.3. You can find this
HTML file on the CD-ROM.
Figure 21.3: The <ISINDEX> tag as it appears in Microsoft Internet Explorer 2.0.
Listing 21.1. The code for index.htm,
a sample HTML file that uses <ISINDEX>.
<HTML>
<HEAD>
<TITLE>Search the Intranet</TITLE>
<ISINDEX>
</HEAD>
<BODY>
<H1>Search the Intranet</H1>
</BODY>
</HTML>
Now you just need to provide a link from your Intranet home page
to the new index file you just created. After you do that, your
site will be searchable by keyword. Use your Web browser and give
it a try.
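It can help when troubleshooting to know what the browser actually sends. With <ISINDEX>, the keywords the user types are appended to the page's URL after a question mark, with a plus sign between words. The Python sketch below builds and takes apart such a URL; the host name intranet and the helper names are hypothetical.

```python
from urllib.parse import quote, unquote

def isindex_url(page_url, keywords):
    """Append keywords to a URL the way an <ISINDEX> query is submitted."""
    # Escape each keyword fully so a literal '+' or space cannot be mistaken
    # for the separator between keywords.
    return page_url + "?" + "+".join(quote(k, safe="") for k in keywords)

def isindex_keywords(url):
    """Recover the list of keywords from an <ISINDEX> query URL."""
    _, _, query = url.partition("?")
    return [unquote(k) for k in query.split("+")] if query else []

url = isindex_url("http://intranet/index.htm", ["mailing", "address"])
print(url)                    # http://intranet/index.htm?mailing+address
print(isindex_keywords(url))  # ['mailing', 'address']
```

Typing a URL of this form directly into the browser's address field is a quick way to test the search without going through the search page.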
Examining waisindex
Table 21.1 lists the waisindex
command-line options, annotated to indicate which are required
and which are optional. After this list, you'll find a bit more
detail about each option that isn't self-explanatory.
Table 21.1. waisindex command-line options.
Option | Description
-a | Adds to existing WAIS index. Optional.
-d database | Specifies database name for WAIS index. Optional; defaults to index.* if -d is not present.
-r | Recursively indexes subdirectories. Optional.
-mem mbytes | Specifies the amount of memory in megabytes to use in creating the database. Optional.
-register | The Windows NT version of waisindex cannot automatically register the database with the master Internet directory of servers. This option displays instructions on how to do it manually. Optional.
-export | Makes database network accessible, outside the Intranet. Optional.
-e filename | Logs errors in filename. Optional.
-l number | Sets log level (0 through 10). Optional.
-v | Prints the version of the software. Optional.
-stdin | Reads filenames to be indexed from standard input. Optional.
-pos or -nopos | Includes (or doesn't include) word position information. Optional.
-nopairs or -pairs | Doesn't include (or includes) word pairs. Optional.
-nocat | Doesn't create catalog files. Optional.
-contents | Indexes the contents, even if the document type is not normally subject to such indexing. Optional.
-nocontents | Indexes only the filename, not the contents, even if the contents are normally indexable. Optional.
-keywords string | Uses string as keyword(s) in indexing. Optional.
-keyword_file filename | Takes indexing keyword from filename. Optional.
-x filename1[,f2,...] | Does not index these files.
-T type | Announces "TYPE" of the document. Optional.
-M type,type | Specifies multitype documents. Optional.
-t type | Specifies actual type of the files. Optional.
The first two options (-a
and -d) control whether a
new database is created or an existing one is appended to.
By default, waisindex will
index only the files you specify. If you use the -r
switch, it will recursively index all the subdirectories and files
underneath the starting directory.
You can speed up waisindex
by giving it a large -mem
parameter. Expressed in megabytes, this parameter is the amount
of your system's virtual memory (not physical RAM) to be used
in creating index databases. Using too high a number here might
interfere with the computer's other tasks, so be careful if your
system is busy. Running your indexing jobs in off hours, when
the system is less busy, can enable you to use more memory. If
the default memory utilization (with no -mem
specification at all on the waisindex
command line) slows your system down, use this argument to limit
the amount of memory used rather than to increase it. This option
should only be necessary for indexing large Intranets.
The two related options -register
and -export might seem similar,
but they do entirely different things. In order for you to make
your WAIS index database fully searchable by WAIS clients, you
must use the -export option.
This option modifies the database.src
file, making it possible for stand-alone WAIS clients to access
the database over a network. Web browsers or CGI gateway scripts
like waislook don't need
this information.
Using the -export option
does not advertise your index database to the Internet. This is
what -register does. In freeWAIS,
the -register option creates
and sends an e-mail message to two main WAIS index registries
on the Internet. In EMWAC WAIS, this option only tells you how
to advertise your index database. The effect is a public Internet
announcement that your index is available to be searched from
anywhere. If you don't want to make your index universally available,
don't use this option. Also, if your network is not connected
to the Internet or is behind a network security firewall, the
-register option is unlikely
to be of any use.
Two options, -e and -l,
enable you to control whether your WAIS server will create logfiles
of its transactions (all the searches that are done) on your index.
In addition, you can control how much logging takes place. The
first option (-e logfile)
tells the server that you want a log kept in the file logfile.
By default, if you have logging enabled, the most verbose logging
is done. To reduce the amount of information that's logged, use
the -l option with a number
between 0 and 9. (Level 10 logging is the default if -e
is used alone.) The lower the number, the less verbose the logging.
If you use default logging, watch the size of your logfiles to
ensure that they don't fill up your disk.
Rather than typing in a list of filenames on the waisindex
command line, you may want to use other command-line utilities
to prepare a list of files for you based on some criteria. You
can then feed that list to waisindex
using the -stdin option.
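As one example of preparing such a list, the Python sketch below collects HTML files changed in the last 30 days and pipes their names to waisindx through -stdin. It assumes the program accepts one filename per line on standard input, which you should confirm against the toolkit manual; the function names and the 30-day cutoff are arbitrary choices for illustration.

```python
import os
import subprocess
import time

def recent_files(root, extensions=(".htm", ".html"), max_age_days=30, now=None):
    """List files under root, relative to it, modified within max_age_days."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    found = []
    for folder, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(folder, name)
            if name.lower().endswith(extensions) and os.path.getmtime(path) >= cutoff:
                found.append(os.path.relpath(path, root))
    return sorted(found)

def index_from_list(root, files, database="index"):
    """Feed the file list to waisindx on standard input (assumed: one name per line)."""
    listing = "\n".join(files) + "\n"
    return subprocess.run(
        ["waisindx", "-d", database, "-t", "html", "-stdin"],
        input=listing, text=True, cwd=root, check=True,
    )
```

Because the names are produced relative to the root directory and the command runs from that directory, they are stored in the filename table in relative form, as described earlier for index.fn.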
One of the files created by waisindex
is known as the catalog file. This file contains the headline
of every document in a WAIS index database. If your database is
large, this file can get quite large. It's really nothing more
than a long list of the files in your database, annotated with
a descriptive headline. Failed searches may result in the catalog
file being returned to your customer, and a long list of headlines
may or may not be helpful. The catalog file is not required for
the WAIS server to function or for your customers to do searches,
so you can dispense with it if you're short on disk space by running
waisindex with the -nocat
option.
Ordinarily, waisindex knows
that there are some kinds of files whose contents can't usefully
be indexed. Examples include image files and other kinds of binary
data. Based on the -t option,
for example, waisindex will
index the contents of several kinds of text files that it knows
about. If, on the other hand, you'd like to inhibit content indexing
of ordinarily indexable files, use -nocontents.
If you want to make sure that your WAIS index database contains
specific keywords, even if some or all of the documents don't
contain them, use -keywords string
and specify the keywords on the command line, or use -keyword_file
filename and specify them in a file. Your extra
keywords will be added to the normal indexing. This feature is
useful when indexing image filenames and other binary data.
The -T and -t
options are confusing because they both appear to specify a document
type. The difference is subtle but important. You can think of
-t as specifying a document format and -T as specifying a document (MIME) type.
The waisindex program has
a built-in list of the document types it recognizes. You can get
this list by entering the waisindex
command with no options at all on your command line. For the most
part, these are types of plain text files whose internal file
format waisindex understands
and can interpret. Examples include Usenet news articles and e-mail
messages. The program expects such files to conform to the standard
format of those kinds of files, with a certain layout and structure.
Thus, the -t option deals
with the format of documents: how they're laid out, what divides
records, and the like.
As you'll also recall, Web servers and browsers know about a list
of MIME data type/subtypes. This is where the -T
option to waisindex comes
in. Because WAIS is built to integrate into a Web, it has MIME
hooks built in. When you index data with waisindex,
you can use the -T command-line
option to specify a MIME type that will be announced when your
index is searched by a Web browser or CGI script. When a Web browser
or CGI gateway script retrieves the document, the MIME type is
returned, and your Web browser deals with it appropriately. Thus,
if you index JPEG image files using -T
JPEG on the waisindex
command line, your customers' Web browsers will know to open the
files they retrieve from your WAIS server as JPEG images.
Note: In some instances, the -T and -t options appear to have the same file type specified. For example, because waisindex knows about GIF images, you might specify -T gif and -t gif on the same command line when indexing GIF files. Because the two options mean different things, their use isn't redundant.
Tip: When using both the -t and -T options with waisindex, always put the -t option first on your command line. In some cases, -t may imply a -T because the overall default for -T is TEXT, so you may not need both options.
In connection with MIME types, the -M
option to waisindex enables
you to specify multiple file types in a single WAIS index database.
Suppose you maintain copies of common word processing documents
in several formats, including Microsoft Word, WordPerfect, rich
text, and plain text. Using the -M
option, you can index all these documents at once using a waisindex
command line, something like the following:
waisindex -d mywords -M MSWORD,WORDPERFECT5.1,RTF,TEXT *.*
In this line, the multiple file types correspond to some of the
additions you have made to the mime.types
file on your Web server over the course of the last several chapters.
Note that you must specify them on the waisindex
command line in uppercase letters.
Indexing Images and Other Document Types
Most Web servers have more than just plain text documents on them.
In particular, Web servers have HTML documents, images, and other
multimedia files on them. So why not extend your WAIS index database
to add these important files?
Suppose your Web server includes a directory tree containing not
only the text files you've already indexed, but also one or more
subdirectories containing GIF images. If the contents of your
images aren't indexed, you may wonder what image indexing will
add to your WAIS index database. Well, all the filenames of the
GIF images in your Web server's file tree are indexed (along with
any associated keywords you've added with the -keywords
option) so that now you and your Intranet's customers can search
for image files the same way you do keyword searches.
As you create more and more HTML documents, you'll collect more
and more images; running waisindex
on them enables you to manage them better. CGI gateway programs
like waislook can help you
and your Intranet's customers search for image files just as they
can help you look for text files. If you have multiple Webmasters
including customers setting up Web servers of their own for your
Intranet, having a searchable, retrievable collection of images
can be a boon. Everyone can share the same set of images, preventing
duplicate work and giving your Intranet a common look.
As you've probably guessed, the technique of indexing filenames
without indexing their contents, as just discussed with images,
can be used for almost any kind of binary data on your Web server.
You can index any set of binary files for easy search and retrieval,
saving you the time and trouble of maintaining Tables of Contents
as documents change.
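To see why a filename-only index is still useful, consider this small Python sketch. It splits each filename into words and maps every word, plus any extra keywords you supply, back to the files. It only illustrates the idea behind -nocontents and -keywords; it is not the toolkit's implementation, and the paths and keywords are invented.

```python
import re
from collections import defaultdict
from pathlib import PurePath

def filename_words(path):
    """Split a file's base name into lowercase words, ignoring the extension."""
    stem = PurePath(path).stem
    return [w.lower() for w in re.split(r"[^A-Za-z0-9]+", stem) if w]

def build_filename_index(paths, extra_keywords=None):
    """Map each filename word (and any extra keyword) to the matching paths."""
    extra_keywords = extra_keywords or {}
    index = defaultdict(set)
    for path in paths:
        for word in filename_words(path) + list(extra_keywords.get(path, [])):
            index[word.lower()].add(path)
    return index

# Hypothetical image files and hand-picked keywords.
paths = ["images/company_logo.gif", "images/org-chart.gif"]
extra = {"images/org-chart.gif": ["departments"]}
image_index = build_filename_index(paths, extra)
print(sorted(image_index.get("logo", set())))         # ['images/company_logo.gif']
print(sorted(image_index.get("departments", set())))  # ['images/org-chart.gif']
```

The second lookup succeeds only because of the extra keyword, which is the reason descriptive filenames and added keywords matter so much when the contents cannot be indexed.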
Tip: When using waisindex to index word processor, spreadsheet, or other data files, be sure to use the -keywords option to add key search words to your index. These documents' contents may not get fully indexed, so you'll want to use this important feature.
Using the same search form shown in Figure 21.3 (where the keyword
address was used), you could obtain results showing not
only the plain text versions of each file found, but also the
original .doc file. Because
your customers' Web browsers are already configured to use Word
as a helper application (see Chapter 13,
"Word Processing on the Web"), they can click the document
they want and load it directly into Word.
A growing number of companies are coming out with commercial software
packages for creating Web-searchable index databases for Intranets.
The following sections sample several of the other commercial
packages.
Fulcrum Surfboard
A long-time maker of full-text search technologies, Fulcrum, Inc.
now has a Web-based product called Surfboard. Surfboard 2.0 for
Windows NT can search both local and network indexes and can search
multiple indexes in a single pass. You can use natural language,
multiword phrases, fielded searches, wildcard word matching (such
as comput* to match computers,
computing, computation, and the like), and Boolean
constructs. It also supports relevance searching. In addition,
you can specify the kind of output you'd like from your search,
with choices including listing or tabular arrangement, HTML, plain
text, or document native format, and you have several choices
for sorting. You'll find more information about Surfboard and
other interesting Fulcrum products for Windows NT at this URL:
http://www.fulcrum.com/english/products/prodhome.htm
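For readers unfamiliar with wildcard word matching, this short Python sketch shows what a pattern such as comput* selects from a word list. It uses Python's standard fnmatch module and says nothing about how Surfboard implements the feature; the word list is made up.

```python
from fnmatch import fnmatchcase

def wildcard_match(pattern, words):
    """Return the words that match a shell-style wildcard pattern, ignoring case."""
    lowered = pattern.lower()
    return [w for w in words if fnmatchcase(w.lower(), lowered)]

words = ["computers", "computing", "computation", "compost", "network"]
print(wildcard_match("comput*", words))  # ['computers', 'computing', 'computation']
```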
Verity Topic
Topic is another product suite consisting of eight products and
including both an Enterprise and an Internet indexing/search engine.
The former supports major office applications' data file formats,
and the latter adds support for HTML documents on a Web. Both
search engines support so-called fuzzy-logic searches, as well
as concept, weighted, and Boolean searches. Following the overall
structure of the Topic system, the Topic client is not a Web browser,
but a stand-alone application, and is available for Windows NT.
Figure 21.4 shows a demo of Topic searches. You can run the demo
at this URL:
http://www.verity.com/demo/d/Topic_Demos/tisdemo.html
Figure 21.4: The Topic Internet Server search demo.
Architext Excite
Excite for Web Servers (EWS) enables users to search multiple
database indexes and includes both concept and keyword searches.
Queries can be natural language, with search results sorted by
what Excite calls Confidence (similar to other weighted relevance
searching). It provides a user-friendly fill-in search form (as
do other packages mentioned in this chapter).
Excite's primary distinction is that it's available for no-cost
download. You can retrieve it from this URL:
http://www.excite.com/navigate/download.cgi
Excite is available for Windows NT and several UNIX systems. The
licensing document that comes with the downloadable package indicates
that Excite can be used internally without any charge, although
you are requested to register the package. (You need only supply
your e-mail address to download it.) No support comes with the
free package, but support contracts, which include future upgrades,
e-mail, and phone support, are available for purchase. Currently,
maintenance agreements for EWS are sold for $995 per year.
Excite supports "concept-based searching," which is
a technology made possible by the way EWS goes through its indexing
process. It uses probabilistic techniques to analyze the interrelationships
between words within a collection of documents. This index supports
concept-based capabilities such as finding relevant documents
that do not even contain the words used in the query statement
and improving the ranking of the returned documents so that the
most important documents are shown to the user first, even when
thousands of documents are found.
Currently EWS only supports ASCII and HTML documents, but Architext
has stated that this restriction will be lifted in the near future.
With what you now know about document conversion, that limitation
can be considered an inconvenience, but not a show-stopper.
PLWeb
Another index-and-retrieval package for Windows NT, Personal Library
Software's PLWeb, is available for no-cost 45-day evaluations
to registered users. See http://www.pls.com
for details of the offer. A demonstration is online there, but
you may want to look at what some of PLS's customers are doing
with the package. For example, Figure 21.5 shows AT&T's searchable
Toll-Free Internet Directory at http://www.att.net/.
Figure 21.5: The AT&T Toll-Free Internet Directory gives Web junkies a quick way to search for 800 numbers.
Focusing on indexing and retrieving data on your Intranet, this
chapter has covered general-purpose indexing packages that can
be accessed using a Web browser. You've learned how to index your
data and how to provide Web-browser interfaces to enable your
customers to search and retrieve data from them. In addition,
you've learned about a specialized database package that you can
use to maintain an online corporate telephone directory for your
customers. Finally, you surveyed the market of commercial software
providing index-and-retrieval features.
The next part of the book, "Sample Applications," is
geared toward typical business uses of an Intranet. The next several
chapters will pull together all that you have learned about Web
technologies and Web tools.
Copyright 1998
EarthWeb Inc., All rights reserved.
Copyright 1998 Macmillan Computer Publishing. All rights reserved.